A Novel Index Supporting High Volume Data Warehouse Insertion
Authors
Abstract
While the desire to support fast, ad hoc query processing for large data warehouses has motivated the recent introduction of many new indexing structures, with a few notable exceptions (namely, the LSM-Tree [4] and the Stepped Merge Method [1]) little attention has been given to developing new indexing schemes that allow fast insertions. Since additions to a large warehouse may number in the millions per day, indices that require a disk seek (or even a significant fraction of a seek) per insertion are not acceptable. In this paper, we offer an alternative to the B+-tree called the Y-tree for indexing huge warehouses having frequent insertions. The Y-tree is a new indexing structure supporting both point and range queries over a single attribute, with retrieval performance comparable to the B+-tree. For processing insertions, however, the Y-tree may exhibit a speedup of 100 times over batched insertions into a B+-tree.
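The claim that a seek per insertion is unacceptable can be checked with back-of-envelope arithmetic. The sketch below is illustrative only: the function name, the 10 ms seek time, and the 5 million insertions per day are assumptions chosen for the example, not figures taken from the paper.

```python
def seek_bound_hours(insertions_per_day: int,
                     seek_ms: float,
                     seeks_per_insert: float = 1.0) -> float:
    """Hours of disk time spent purely on seeks for one day's insertions.

    All inputs are hypothetical workload parameters, not measurements.
    """
    total_ms = insertions_per_day * seeks_per_insert * seek_ms
    return total_ms / 1000.0 / 3600.0


if __name__ == "__main__":
    # Assumed workload: 5 million insertions per day, 10 ms per seek.
    one_seek_each = seek_bound_hours(5_000_000, 10.0)
    # Assumed amortized cost of one seek per 100 insertions.
    amortized = seek_bound_hours(5_000_000, 10.0, seeks_per_insert=0.01)
    print(f"one seek per insertion: {one_seek_each:.1f} hours")
    print(f"0.01 seeks per insertion: {amortized:.2f} hours")
```

Under these assumed numbers, one seek per insertion consumes more than half a day of disk time on seeks alone, which is why an index has to amortize each seek over many insertions.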
Similar Resources
Worcester Polytechnic Institute
A data warehouse typically differs from an OLTP database in both its significantly larger data pages and the volume of data inserted in bulk. The traditional B+ Tree and its variants, while still popular candidates for supporting point and range queries, can become very memory intensive for insert and delete operations under these more stringent requirements. Since typ...
Novel Techniques for Data Warehousing and Online Analytical Processing in Emerging Applications
A data warehouse is a collection of data that supports the decision-making process. Data cubes and on-line analytical processing (OLAP) have become very popular techniques for helping users analyze data in a warehouse. Even though previous studies on data warehouses and data cubes have been proposed and developed, as new applications emerge there are still technical challenges which have not been...
Improvement of the Analytical Queries Response Time in Real-Time Data Warehouse using Materialized Views Concatenation
A real-time data warehouse is a collection of recent and hierarchical data that is used for managers’ decision-making by creating online analytical queries. The volume of data collected from data sources and entered into the real-time data warehouse is constantly increasing. Moreover, as the volume of input data to the real-time data warehouse increases, the interference between online loading ...
The Impact of Partitioned Fact Tables and Bitmap Index on Data Warehouse Performance
The design process is the task that most concerns a data warehouse designer, because design decisions are critical to performance once the warehouse goes into production. Even though technology has advanced in terms of memory volume and speed, storage... minimizing execution time and storage space remains a preoccupation for data warehousing specialists. This paper shows that using bitmap index and...
Comparison of Data Warehousing DBMS Platforms
Although relational databases (RDBMS) are the most common choice for data warehouse implementations, their record-based structure is far from ideal. As data volumes grow and users demand more sophisticated analytical capabilities, the deficiencies of the RDBMS to data storage become more conspicuous. RDBMS data warehouse systems are difficult to design; extremely inefficient in their use of dis...